    SQ Lower Bounds for Learning Bounded Covariance GMMs

    We study the complexity of learning mixtures of separated Gaussians with a common unknown bounded covariance matrix. Specifically, we focus on learning Gaussian mixture models (GMMs) on $\mathbb{R}^d$ of the form $P = \sum_{i=1}^k w_i \mathcal{N}(\boldsymbol{\mu}_i, \mathbf{\Sigma}_i)$, where $\mathbf{\Sigma}_i = \mathbf{\Sigma} \preceq \mathbf{I}$ and $\min_{i \neq j} \|\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\|_2 \geq k^\epsilon$ for some $\epsilon > 0$. Known learning algorithms for this family of GMMs have complexity $(dk)^{O(1/\epsilon)}$. In this work, we prove that any Statistical Query (SQ) algorithm for this problem requires complexity at least $d^{\Omega(1/\epsilon)}$. In the special case where the separation is on the order of $k^{1/2}$, we additionally obtain fine-grained SQ lower bounds with the correct exponent. Our SQ lower bounds imply similar lower bounds for low-degree polynomial tests. Conceptually, our results provide evidence that known algorithms for this problem are nearly best possible.
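
    As a minimal illustration of this data model (not taken from the paper), the sketch below samples from a mixture $P = \sum_i w_i \mathcal{N}(\boldsymbol{\mu}_i, \mathbf{\Sigma})$ with a shared covariance $\mathbf{\Sigma} \preceq \mathbf{I}$ and pairwise mean separation at least $k^\epsilon$; the concrete construction (means on a coordinate axis, uniform weights, diagonal covariance) is a hypothetical choice.

        # Minimal sketch of the GMM family studied above; all concrete
        # choices (mean placement, weights, covariance) are assumptions.
        import numpy as np

        def sample_separated_gmm(n, d, k, eps, seed=0):
            rng = np.random.default_rng(seed)
            sep = k ** eps
            mus = np.zeros((k, d))
            mus[:, 0] = sep * np.arange(k)   # ||mu_i - mu_j||_2 = sep*|i-j| >= k^eps
            evals = rng.uniform(0.1, 1.0, size=d)
            sigma = np.diag(evals)           # common covariance with Sigma <= I
            comps = rng.choice(k, size=n)    # uniform mixing weights w_i = 1/k
            x = rng.multivariate_normal(np.zeros(d), sigma, size=n) + mus[comps]
            return x, comps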

    Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise

    We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise. We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms. Concretely, the sample complexity of the problem is $\widetilde{\Theta}(1/(\gamma^2 \epsilon))$. We start by giving a simple efficient algorithm with sample complexity $\widetilde{O}(1/(\gamma^2 \epsilon^2))$. Our main result is a lower bound for Statistical Query (SQ) algorithms and low-degree polynomial tests suggesting that the quadratic dependence on $1/\epsilon$ in the sample complexity is inherent for computationally efficient algorithms. Specifically, our results imply a lower bound of $\widetilde{\Omega}(1/(\gamma^{1/2} \epsilon^2))$ on the sample complexity of any efficient SQ learner or low-degree test.
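
    For concreteness, here is a hedged sketch (not the paper's algorithm) of the underlying data model: points with margin at least $\gamma$ around a unit-norm target $w$, with each label then flipped independently with probability $\eta < 1/2$. The rejection-sampling construction and the choice of target are assumptions made for illustration.

        # Minimal sketch of gamma-margin halfspace data under Random
        # Classification Noise; target direction and sampling are assumptions.
        import numpy as np

        def sample_margin_rcn(n, d, gamma, eta, seed=0):
            rng = np.random.default_rng(seed)
            w = np.zeros(d); w[0] = 1.0                    # unit-norm target
            x = rng.standard_normal((n, d))
            x /= np.linalg.norm(x, axis=1, keepdims=True)  # points on the unit sphere
            keep = np.abs(x @ w) >= gamma                  # enforce the gamma-margin
            x = x[keep]
            y = np.sign(x @ w)
            flips = rng.random(len(y)) < eta               # RCN: flip w.p. eta
            y[flips] *= -1
            return x, y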

    Learning general halfspaces with general Massart noise under the Gaussian distribution

    We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $\eta(\mathbf{x}) \leq \eta$, for some parameter $\eta \in [0, 1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + \epsilon$, where $\mathrm{OPT}$ is the error of the target halfspace. This problem had been previously studied under two assumptions: (i) the target halfspace is homogeneous (i.e., the separating hyperplane goes through the origin), and (ii) the parameter $\eta$ is strictly smaller than $1/2$. Prior to this work, no nontrivial bounds were known when either of these assumptions is removed. We study the general problem and establish the following. For $\eta < 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_\eta(\log(1/\gamma))} \, \mathrm{poly}(1/\epsilon)$, where $\gamma = \max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\}\}$ is the bias of the target halfspace $f$. Prior efficient algorithms could only handle the special case of $\gamma = 1/2$. Interestingly, we establish a qualitatively matching lower bound of $d^{\Omega(\log(1/\gamma))}$ on the complexity of any Statistical Query (SQ) algorithm. For $\eta = 1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $O_\epsilon(1) \, d^{O(\log(1/\epsilon))}$. This result is new even for the subclass of homogeneous halfspaces; prior algorithms for homogeneous Massart halfspaces provide vacuous guarantees for $\eta = 1/2$. We complement our upper bound with a nearly-matching SQ lower bound of $d^{\Omega(\log(1/\epsilon))}$, which holds even for the special case of homogeneous halfspaces.
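
    The Massart model above differs from Random Classification Noise in that the flip probability may depend on the point. Below is a hedged sketch of one such data source for a general (non-homogeneous) halfspace $f(\mathbf{x}) = \mathrm{sign}(\langle w, \mathbf{x} \rangle - t)$ under $\mathcal{N}(0, \mathbf{I})$; the particular $\eta(\mathbf{x})$, which flips more often near the decision boundary, is a hypothetical adversary, not one from the paper.

        # Minimal sketch of Massart-noise labels for a biased halfspace;
        # the concrete flip function eta_x is an assumed example.
        import numpy as np

        def sample_massart_halfspace(n, d, t, eta, seed=0):
            rng = np.random.default_rng(seed)
            w = np.zeros(d); w[0] = 1.0                   # unit-norm target direction
            x = rng.standard_normal((n, d))               # x ~ N(0, I_d)
            clean = np.where(x @ w - t >= 0, 1.0, -1.0)   # f(x); threshold t sets the bias
            eta_x = eta * np.exp(-np.abs(x @ w - t))      # eta(x) <= eta, largest at boundary
            y = np.where(rng.random(n) < eta_x, -clean, clean)
            return x, y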